{
    "cells": [
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## Setting up the Working Directory\n",
                "This cell is to ensure we change the directory to anomalib source code to have access to the datasets and config files. We assume that you already went through `001_getting_started.ipynb` and install the required packages."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [],
            "source": [
                "import os\n",
                "from pathlib import Path\n",
                "\n",
                "from git.repo import Repo\n",
                "\n",
                "current_directory = Path.cwd()\n",
                "if current_directory.name == \"400_openvino\":\n",
                "    # On the assumption that, the notebook is located in\n",
                "    #   ~/anomalib/notebooks/400_openvino/\n",
                "    root_directory = current_directory.parent.parent\n",
                "elif current_directory.name == \"anomalib\":\n",
                "    # This means that the notebook is run from the main anomalib directory.\n",
                "    root_directory = current_directory\n",
                "else:\n",
                "    # Otherwise, we'll need to clone the anomalib repo to the `current_directory`\n",
                "    repo = Repo.clone_from(url=\"https://github.com/openvinotoolkit/anomalib.git\", to_path=current_directory)\n",
                "    root_directory = current_directory / \"anomalib\"\n",
                "\n",
                "os.chdir(root_directory)"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# int-8 Model Quantization via NNCF\n",
                "It is possible to use Neural Network Compression Framework ([NNCF](https://github.com/openvinotoolkit/nncf)) with anomalib for inference optimization in [OpenVINO](https://docs.openvino.ai/latest/index.html) with minimal accuracy drop.\n",
                "\n",
                "This notebook demonstrates how NNCF is enabled in anomalib to optimize the model for inference. Before diving into the details, let's first train a model using the standard Torch training loop.\n",
                "\n",
                "## 1. Standard Training without NNCF\n",
                "To train model without NNCF, we use the standard training loop. We use the same training loop as in the [Getting Started Notebook](https://github.com/openvinotoolkit/anomalib/blob/main/notebooks/000_getting_started/001_getting_started.ipynb)."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "from anomalib.engine import Engine\n",
                "\n",
                "from anomalib.config import get_configurable_parameters\n",
                "from anomalib.data import get_datamodule\n",
                "from anomalib.models import get_model\n",
                "from anomalib.utils.callbacks import get_callbacks"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Configuration\n",
                "Similar to the [Getting Started Notebook](https://github.com/openvinotoolkit/anomalib/blob/main/notebooks/000_getting_started/001_getting_started.ipynb), we will start with the [PADIM](https://github.com/openvinotoolkit/anomalib/tree/main/anomalib/models/padim) model. We follow the standard training loop, where we first import the config file, with which we import datamodule, model, callbacks and trainer, respectively."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "MODEL = \"padim\"  # 'padim', 'stfpm'\n",
                "CONFIG_PATH = f\"src/anomalib/models/{MODEL}/config.yaml\"\n",
                "\n",
                "config = get_configurable_parameters(config_path=CONFIG_PATH)"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Note that we will run OpenVINO's `benchmark_app` on the model, which requires the model to be in the ONNX format. Therefore, we set the `export_type` flag to `onnx` in the `optimization` [config file](https://github.com/openvinotoolkit/anomalib/blob/main/anomalib/models/padim/config.yaml#L61). Let's check the current config:"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 4,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "{'export_type': None}\n"
                    ]
                }
            ],
            "source": [
                "print(config[\"optimization\"])"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "As mentioned above, we need to explicitly state that we want onnx export mode to be able to run `benchmark_app` to compute the throughput and latency of the model."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 5,
            "metadata": {},
            "outputs": [],
            "source": [
                "config[\"optimization\"][\"export_type\"] = \"onnx\""
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "datamodule = get_datamodule(config)\n",
                "model = get_model(config)\n",
                "callbacks = get_callbacks(config)\n",
                "\n",
                "# start training\n",
                "engine = Engine(**config.trainer, callbacks=callbacks)\n",
                "engine.fit(model=model, datamodule=datamodule)\n",
                "fp32_results = engine.test(model=model, datamodule=datamodule)"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## 2. Training with NNCF\n",
                "The above results indicate the performance of the standard fp32 model. We will now use NNCF to optimize the model for inference using int8 quantization. Note, that NNCF and Anomalib integration is still in the experimental stage and currently only Padim and STFPM models are optimized. It is likely that other models would work with NNCF; however, we do not guarantee that the accuracy will not drop significantly.\n",
                "\n",
                "### 2.1. Padim Model\n",
                "To optimize the Padim model for inference, we need to add NNCF configurations to the `optimization` section of the [config file](https://github.com/openvinotoolkit/anomalib/blob/main/anomalib/models/padim/config.yaml#L60). The following configurations are added to the config file:\n",
                "\n",
                "```yaml\n",
                "optimization:\n",
                "  export_type: null # options: torch, onnx, openvino\n",
                "  nncf:\n",
                "    apply: true\n",
                "    input_info:\n",
                "      sample_size: [1, 3, 256, 256]\n",
                "    compression:\n",
                "      algorithm: quantization\n",
                "      preset: mixed\n",
                "      initializer:\n",
                "        range:\n",
                "          num_init_samples: 250\n",
                "        batchnorm_adaptation:\n",
                "          num_bn_adaptation_samples: 250\n",
                "```\n",
                "\n",
                "After updating the `config.yaml` file, `config` could be reloaded and the model could be trained with NNCF enabled. Alternatively, we could manually add these NNCF settings to the `config` dictionary here. Since we already have the `config` dictionary, we will choose the latter option, and manually add the NNCF configs."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "outputs": [],
            "source": [
                "config[\"optimization\"][\"nncf\"] = {\n",
                "    \"apply\": True,\n",
                "    \"input_info\": {\"sample_size\": [1, 3, 256, 256]},\n",
                "    \"compression\": {\n",
                "        \"algorithm\": \"quantization\",\n",
                "        \"preset\": \"mixed\",\n",
                "        \"initializer\": {\"range\": {\"num_init_samples\": 250}, \"batchnorm_adaptation\": {\"num_bn_adaptation_samples\": 250}},\n",
                "    },\n",
                "}"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Now that we have updated the config with the NNCF settings, we could train and tests the NNCF model that will be optimized via int8 quantization."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "datamodule = get_datamodule(config)\n",
                "model = get_model(config)\n",
                "callbacks = get_callbacks(config)\n",
                "\n",
                "# start training\n",
                "engine = Engine(**config.trainer, callbacks=callbacks)\n",
                "engine.fit(model=model, datamodule=datamodule)\n",
                "int8_results = engine.test(model=model, datamodule=datamodule)"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 9,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "[{'pixel_F1Score': 0.7241847515106201, 'pixel_AUROC': 0.9831862449645996, 'image_F1Score': 0.9921259880065918, 'image_AUROC': 0.9960317611694336}]\n"
                    ]
                }
            ],
            "source": [
                "print(fp32_results)"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 10,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": [
                        "[{'pixel_F1Score': 0.7164928913116455, 'pixel_AUROC': 0.9731722474098206, 'image_F1Score': 0.9841269850730896, 'image_AUROC': 0.9904761910438538}]\n"
                    ]
                }
            ],
            "source": [
                "print(int8_results)"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "As can be seen above, there is a slight performance drop in the accuracy of the model. However, the model is now ready to be optimized for inference. We could now use `benchmark_app` to compute the throughput and latency of the fp32 and int8 models and compare the results."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# Check the throughput and latency performance of the fp32 model.\n",
                "!benchmark_app -m results/padim/mvtec/bottle/run/onnx/model.onnx -t 10"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# Check the throughput and latency performance of the int8 model.\n",
                "!benchmark_app -m results/padim/mvtec/bottle/run/compressed/model_nncf.onnx -t 10"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "We have observed approximately 30.1% performance improvement in the throughput and latency of the int8 model compared to the fp32 model. This is a significant performance improvement, which could be achieved with 6.94% drop pixel F1-Score.\n",
                "\n",
                "### 2.2. STFPM Model\n",
                "Same steps in 2.1 could be followed to optimize the STFPM model for inference. The only difference is that the `config.yaml` file for STFPM model, located [here](https://github.com/openvinotoolkit/anomalib/blob/main/anomalib/models/stfpm/config.yaml#L67), should be updated with the following:\n",
                "\n",
                "```yaml\n",
                "optimization:\n",
                "  export_type: null # options: torch, onnx, openvino\n",
                "  nncf:\n",
                "    apply: true\n",
                "    input_info:\n",
                "      sample_size: [1, 3, 256, 256]\n",
                "    compression:\n",
                "      algorithm: quantization\n",
                "      preset: mixed\n",
                "      initializer:\n",
                "        range:\n",
                "          num_init_samples: 250\n",
                "        batchnorm_adaptation:\n",
                "          num_bn_adaptation_samples: 250\n",
                "      ignored_scopes:\n",
                "      - \"{re}.*__pow__.*\"\n",
                "    update_config:\n",
                "      hyperparameter_search:\n",
                "        parameters:\n",
                "          lr:\n",
                "            min: 1e-4\n",
                "            max: 1e-2\n",
                "```\n",
                "This is to ensure that we achieve the best accuracy vs throughput trade-off."
            ]
        }
    ],
    "metadata": {
        "kernelspec": {
            "display_name": "anomalib",
            "language": "python",
            "name": "python3"
        },
        "language_info": {
            "codemirror_mode": {
                "name": "ipython",
                "version": 3
            },
            "file_extension": ".py",
            "mimetype": "text/x-python",
            "name": "python",
            "nbconvert_exporter": "python",
            "pygments_lexer": "ipython3",
            "version": "3.8.13"
        },
        "orig_nbformat": 4,
        "vscode": {
            "interpreter": {
                "hash": "ae223df28f60859a2f400fae8b3a1034248e0a469f5599fd9a89c32908ed7a84"
            }
        }
    },
    "nbformat": 4,
    "nbformat_minor": 2
}
